{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Numpy介绍](02.00-Introduction-to-NumPy.ipynb) | [目录](Index.ipynb) | [Numpy数组基础](02.02-The-Basics-Of-NumPy-Arrays.ipynb) >\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/wangyingsm/Python-Data-Science-Handbook/blob/master/notebooks/02.01-Understanding-Data-Types.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open and Execute in Google Colaboratory\"></a>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Understanding Data Types in Python\n",
    "\n",
    "# 理解Python中的数据类型"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Effective data-driven science and computation requires understanding how data is stored and manipulated.\n",
    "This section outlines and contrasts how arrays of data are handled in the Python language itself, and how NumPy improves on this.\n",
    "Understanding this difference is fundamental to understanding much of the material throughout the rest of the book.\n",
    "\n",
    "想要有效的掌握数据驱动科学和计算需要理解数据是如何存储和处理的。本节将描述和对比数组在Python语言中和在NumPy中是怎么处理的，NumPy是如何优化了这部分的内容。理解这个区别是理解本书后续内容的基础。\n",
    "\n",
    "> Users of Python are often drawn-in by its ease of use, one piece of which is dynamic typing.\n",
    "While a statically-typed language like C or Java requires each variable to be explicitly declared, a dynamically-typed language like Python skips this specification. For example, in C you might specify a particular operation as follows:\n",
    "\n",
    "Python的用户通常都是被它的易用性吸引来的，其中很重要一环就是动态类型。静态类型的语言，例如C或者Java，每个变量都需要明确声明，而动态类型语言如Python就略过了这个部分。例如，在C中，你可能会写如下的代码片段：\n",
    "\n",
    "```C\n",
    "int result = 0;\n",
    "for(int i=0; i<100; i++){\n",
    "    result += i;\n",
    "}\n",
    "```\n",
    "\n",
    "> While in Python the equivalent operation could be written this way:\n",
    "\n",
    "但是在Python当中，等效的代码如下：\n",
    "\n",
    "```python\n",
    "result = 0\n",
    "for i in range(100):\n",
    "    result += i\n",
    "```\n",
    "\n",
    "> Notice the main difference: in C, the data types of each variable are explicitly declared, while in Python the types are dynamically inferred. This means, for example, that we can assign any kind of data to any variable:\n",
    "\n",
    "注意其中主要的区别：在C当中，每个变量都需要显式声明，Python的类型是动态推断的。这意味着，我们可以给任何的变量赋值为任何类型的数据，例如：\n",
    "\n",
    "```python\n",
    "x = 4\n",
    "x = \"four\"\n",
    "```\n",
    "\n",
    "> Here we've switched the contents of ``x`` from an integer to a string. The same thing in C would lead (depending on compiler settings) to a compilation error or other unintented consequences:\n",
    "\n",
    "上面的例子中我们将`x`变量的内容从一个整数变成了一个字符串。如果你想在C语言中这样做，取决于不同的编译器，可能会导致一个编译错误或者其他无法预料的结果。\n",
    "\n",
    "```C\n",
    "int x = 4;\n",
    "x = \"four\";  // 编译错误\n",
    "```\n",
    "\n",
    "> This sort of flexibility is one piece that makes Python and other dynamically-typed languages convenient and easy to use.\n",
    "Understanding *how* this works is an important piece of learning to analyze data efficiently and effectively with Python.\n",
    "But what this type-flexibility also points to is the fact that Python variables are more than just their value; they also contain extra informatiinon about the type of the value. We'll explore this more in the sections that follow.\n",
    "\n",
    "这种灵活性提供了Python和其他动态类型语言在使用上的简易性。但是，理解这里面的*工作原理*对于在Python中高效准确的学习和分析数据是非常重要的。Python的这种类型灵活性，实际上是付出了额外的存储代价的，变量不仅仅存储了数据本身，还需要存储其相应的类型。我们会在本节接下来的部分继续讨论。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A Python Integer Is More Than Just an Integer\n",
    "\n",
    "## Python的整数不仅仅是一个整数\n",
    "\n",
    "> The standard Python implementation is written in C.\n",
    "This means that every Python object is simply a cleverly-disguised C structure, which contains not only its value, but other information as well. For example, when we define an integer in Python, such as ``x = 10000``, ``x`` is not just a \"raw\" integer. It's actually a pointer to a compound C structure, which contains several values.\n",
    "Looking through the Python 3.4 source code, we find that the integer (long) type definition effectively looks like this (once the C macros are expanded):\n",
    "\n",
    "标准的Python实现是使用C语言编写的。这意味着每个Python当中的对象都是一个伪装良好的C结构体，结构体内不仅仅包括它的值，还有其他的信息。例如，当我们在Python中定义了一个整数，比方说`x=10000`，`x`不仅仅是一个原始的整数，它在底层实际上是一个指向复杂C结构体的指针，里面含有若干个字段。当你查阅Python 3.4的源代码的时候，你会发现整数（实际上是长整形）的定义如下（我们将C语言中的宏定义展开后）：\n",
    "\n",
    "```C\n",
    "struct _longobject {\n",
    "    long ob_refcnt;\n",
    "    PyTypeObject *ob_type;\n",
    "    size_t ob_size;\n",
    "    long ob_digit[1];\n",
    "};\n",
    "```\n",
    "\n",
    "> A single integer in Python 3.4 actually contains four pieces:\n",
    "\n",
    "> - ``ob_refcnt``, a reference count that helps Python silently handle memory allocation and deallocation\n",
    "> - ``ob_type``, which encodes the type of the variable\n",
    "> - ``ob_size``, which specifies the size of the following data members\n",
    "> - ``ob_digit``, which contains the actual integer value that we expect the Python variable to represent.\n",
    "\n",
    "一个Python的整数实际上包含四个部分：\n",
    "\n",
    "- ``ob_refcnt``：引用计数器，Python用这个字段来进行内存分配和垃圾收集\n",
    "- ``ob_type``：变量类型的编码内容\n",
    "- ``ob_size``：表示下面的数据字段的长度\n",
    "- ``ob_digit``：真正的整数值存储在这个字段\n",
    "\n",
    "> This means that there is some overhead in storing an integer in Python as compared to an integer in a compiled language like C, as illustrated in the following figure:\n",
    "\n",
    "这意味着在Python中存储一个整数要比在像C这样的编译语言中存储一个整数要有损耗，就像下图展示的那样："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Integer Memory Layout](https://github.com/wangyingsm/Python-Data-Science-Handbook/raw/e044d370f852cd3639bbe45ebc2eb3e6f11c1e62/notebooks/figures/cint_vs_pyint.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Here ``PyObject_HEAD`` is the part of the structure containing the reference count, type code, and other pieces mentioned before.\n",
    "\n",
    "这里的`PyObject_HEAD`代表了前面的引用计数器、类型代码和数据长度的三个字段内容。\n",
    "\n",
    "> Notice the difference here: a C integer is essentially a label for a position in memory whose bytes encode an integer value.\n",
    "A Python integer is a pointer to a position in memory containing all the Python object information, including the bytes that contain the integer value.\n",
    "This extra information in the Python integer structure is what allows Python to be coded so freely and dynamically.\n",
    "All this additional information in Python types comes at a cost, however, which becomes especially apparent in structures that combine many of these objects.\n",
    "\n",
    "再次注意一下这里的区别：C的整数就是简单一个内存位置，这个位置上的固定长度的字节可以表示一个整数；Python中的一个整数是一个指向内存位置的指针，该内存位置包括Python需要表示一个整数的所有信息，其中最后固定长度的字节才真正存储这个整数。这些额外的信息提供了Python的灵活性和易用性。这些Python类型需要的额外信息是有额外损失的，特别是当有一个集合需要存储许多这种类型的数据时。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A Python List Is More Than Just a List\n",
    "\n",
    "## Python的列表不仅仅是一个列表\n",
    "\n",
    "> Let's consider now what happens when we use a Python data structure that holds many Python objects.\n",
    "The standard mutable multi-element container in Python is the list.\n",
    "We can create a list of integers as follows:\n",
    "\n",
    "现在我们继续考虑当我们使用Python的数据结构来存储许多这样的Python对象时的情况。Python中标准的可变多元素的容器集合就是列表。我们按如下的方式创建一个整数的列表："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L = list(range(10))\n",
    "L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "int"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(L[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Or, similarly, a list of strings:\n",
    "\n",
    "又或者，类似的，字符串的列表："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L2 = [str(c) for c in L] # 列表解析\n",
    "L2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "str"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(L2[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Because of Python's dynamic typing, we can even create heterogeneous lists:\n",
    "\n",
    "因为Python是动态类型，我们甚至可以创建不同类型元素的列表："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[bool, str, float, int]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L3 = [True, \"2\", 3.0, 4]\n",
    "[type(item) for item in L3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> But this flexibility comes at a cost: to allow these flexible types, each item in the list must contain its own type info, reference count, and other information–that is, each item is a complete Python object.\n",
    "In the special case that all variables are of the same type, much of this information is redundant: it can be much more efficient to store data in a fixed-type array.\n",
    "The difference between a dynamic-type list and a fixed-type (NumPy-style) array is illustrated in the following figure:\n",
    "\n",
    "这种灵活性是要付出代价的：要让列表能够容纳不同的类型，每个列表中的元素都必须带有自己的类型信息、引用计数器和其他的信息，一句话，里面的每个元素都是一个完整的Python的对象。如果在所有的元素都是同一种类型的情况下，这里面绝大部分的信息都是冗余的：如果我们能将数据存储在一个固定类型的数组中，显然会更加高效。下图展示了动态类型的列表和固定类型的数组（NumPy实现的）的区别："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Array Memory Layout](https://github.com/wangyingsm/Python-Data-Science-Handbook/raw/e044d370f852cd3639bbe45ebc2eb3e6f11c1e62/notebooks/figures/array_vs_list.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> At the implementation level, the array essentially contains a single pointer to one contiguous block of data.\n",
    "The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object like the Python integer we saw earlier.\n",
    "Again, the advantage of the list is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type.\n",
    "Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.\n",
    "\n",
    "从底层实现上看，数组仅仅包含一个指针指向一块连续的内存空间。而Python列表，含有一个指针指向一块连续的指针内存空间，里面的每个指针再指向内存中每个独立的Python对象，如我们前面看到的整数。列表的优势在于灵活：因为每个元素都是完整的Python的类型对象结构，包含了数据和类型信息，因此列表可以存储任何类型的数据。NumPy使用的固定类型的数组缺少这种灵活性，但是对于存储和操作数据会高效许多。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fixed-Type Arrays in Python\n",
    "\n",
    "## Python的固定类型数组\n",
    "\n",
    "> Python offers several different options for storing data in efficient, fixed-type data buffers.\n",
    "The built-in ``array`` module (available since Python 3.3) can be used to create dense arrays of a uniform type:\n",
    "\n",
    "Python提供了许多不同的选择能让你高效的存储数据，使用固定类型数据。內建的`array`模块（从Python 3.3开始提供）可以用来创建同一类型的数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import array\n",
    "L = list(range(10))\n",
    "A = array.array('i', L)\n",
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Here ``'i'`` is a type code indicating the contents are integers.\n",
    "\n",
    "这里的`i`是表示数据类型是整数的类型代码。\n",
    "\n",
    "> Much more useful, however, is the ``ndarray`` object of the NumPy package.\n",
    "While Python's ``array`` object provides efficient storage of array-based data, NumPy adds to this efficient *operations* on that data.\n",
    "We will explore these operations in later sections; here we'll demonstrate several ways of creating a NumPy array.\n",
    "\n",
    "更常用的是`ndarray`对象，由NumPy包提供。虽然Python的`array`提供了数组的高效存储，NumPy更加提供了数组的高效*运算*。我们会在后续小节中陆续介绍这些操作；这里我们首先介绍创建NumPy数组的集中方式。\n",
    "\n",
    "> We'll start with the standard NumPy import, under the alias ``np``:\n",
    "\n",
    "当然最开始要做的是将NumPy包载入，惯例上提供别名`np`："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating Arrays from Python Lists\n",
    "\n",
    "## 使用Python列表创建数组\n",
    "\n",
    "> First, we can use ``np.array`` to create arrays from Python lists:\n",
    "\n",
    "首先，我们可以使用`np.array`来将一个Python列表变成一个数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 4, 2, 5, 3])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 整数数组:\n",
    "np.array([1, 4, 2, 5, 3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Remember that unlike Python lists, NumPy is constrained to arrays that all contain the same type.\n",
    "If types do not match, NumPy will upcast if possible (here, integers are up-cast to floating point):\n",
    "\n",
    "记住和Python列表不同，NumPy数组只能含有同一种类型的数据。如果类型不一样，NumPy会尝试向上扩展类型（下面例子中会将整数向上扩展为浮点数）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3.14, 4.  , 2.  , 3.  ])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array([3.14, 4, 2, 3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> If we want to explicitly set the data type of the resulting array, we can use the ``dtype`` keyword:\n",
    "\n",
    "如果你需要明确指定数据的类型，你可以使用`dtype`关键字参数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1., 2., 3., 4.], dtype=float32)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array([1, 2, 3, 4], dtype='float32')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Finally, unlike Python lists, NumPy arrays can explicitly be multi-dimensional; here's one way of initializing a multidimensional array using a list of lists:\n",
    "\n",
    "最后，不同于Python的列表，NumPy的数组可以明确表示为多维；下面例子是一个使用列表的列表来创建二维数组的方法："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2, 3, 4],\n",
       "       [4, 5, 6],\n",
       "       [6, 7, 8]])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 更准确的说，应该是生成器的列表，列表解析中有三个range生成器\n",
    "# 分别是range(2, 5), range(4, 7) 和 range(6, 9)\n",
    "np.array([range(i, i + 3) for i in [2, 4, 6]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> The inner lists are treated as rows of the resulting two-dimensional array.\n",
    "\n",
    "内部的列表作为二维数组的行。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating Arrays from Scratch\n",
    "\n",
    "## 从头开始创建数组\n",
    "\n",
    "> Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy.\n",
    "Here are several examples:\n",
    "\n",
    "使用NumPy的方法从头创建数组会更加高效，特别对于大型数组来说。下面有几个例子："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# zeros将数组元素都填充为0，10是数组长度\n",
    "np.zeros(10, dtype=int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1., 1., 1., 1., 1.],\n",
       "       [1., 1., 1., 1., 1.],\n",
       "       [1., 1., 1., 1., 1.]])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# ones将数组元素都填充为1，(3, 5)是数组的维度说明，表明数组是二维的3行5列\n",
    "np.ones((3, 5), dtype=float)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[3.14, 3.14, 3.14, 3.14, 3.14],\n",
       "       [3.14, 3.14, 3.14, 3.14, 3.14],\n",
       "       [3.14, 3.14, 3.14, 3.14, 3.14]])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# full将数组元素都填充为参数值3.14，(3, 5)是数组的维度说明，表明数组是二维的3行5列\n",
    "np.full((3, 5), 3.14)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# arange类似range，创建一段序列值\n",
    "# 起始值是0（包含），结束值是20（不包含），步长为2\n",
    "np.arange(0, 20, 2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.  , 0.25, 0.5 , 0.75, 1.  ])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# linspace创建一段序列值，其中元素按照区域进行线性（平均）划分\n",
    "# 起始值是0（包含），结束值是1（包含），共5个元素\n",
    "np.linspace(0, 1, 5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.28957547, 0.80872794, 0.36451325],\n",
       "       [0.30178461, 0.13998063, 0.21693246],\n",
       "       [0.81413802, 0.26299406, 0.53082583]])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# random.random随机分布创建数组\n",
    "# 随机值范围为[0, 1)，(3, 3)是维度说明，二维数组3行3列\n",
    "np.random.random((3, 3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.51772646,  0.39614948, -0.10634696],\n",
       "       [ 0.25671348,  0.00732722,  0.37783601],\n",
       "       [ 0.68446945,  0.15926039, -0.70744073]])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# random.normal正态分布创建数组\n",
    "# 均值0，标准差1，(3, 3)是维度说明，二维数组3行3列\n",
    "np.random.normal(0, 1, (3, 3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2, 3, 4],\n",
       "       [5, 7, 8],\n",
       "       [0, 5, 0]])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# random.randint随机整数创建数组，随机数范围[0, 10)\n",
    "np.random.randint(0, 10, (3, 3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  0.,  0.],\n",
       "       [ 0.,  1.,  0.],\n",
       "       [ 0.,  0.,  1.]])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 3x3的单位矩阵数组\n",
    "np.eye(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5.74020278e+180, 4.00193173e-322, 4.66594353e-310])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# empty创建一个未初始化的数组，数组元素的值保持为原有的内存空间值\n",
    "np.empty(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## NumPy Standard Data Types\n",
    "\n",
    "## NumPy标准数据类型\n",
    "\n",
    "> NumPy arrays contain values of a single type, so it is important to have detailed knowledge of those types and their limitations.\n",
    "Because NumPy is built in C, the types will be familiar to users of C, Fortran, and other related languages.\n",
    "\n",
    "NumPy数组仅包含一种类型数据，因此它的类型系统和Python也有所区别，因为对于每一种NumPy类型，都需要更详细的类型信息和限制。因为NumPy是使用C构建的，它的类型系统对于C、Fortran的用户来说不会陌生。\n",
    "\n",
    "> The standard NumPy data types are listed in the following table.\n",
    "Note that when constructing an array, they can be specified using a string:\n",
    "\n",
    "标准NumPy数据类型见下表。正如上面介绍的，当我们创建数组的时候，我们可以将`dtype`参数指定为下面类型的字符串名称来指定数组的数据类型。\n",
    "\n",
    "```python\n",
    "np.zeros(10, dtype='int16')\n",
    "```\n",
    "\n",
    "> Or using the associated NumPy object:\n",
    "\n",
    "也可以将`dtype`指定为对应的NumPy对象：\n",
    "\n",
    "\n",
    "```python\n",
    "np.zeros(10, dtype=np.int16)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "| Data type\t    | Description |\n",
    "|---------------|-------------|\n",
    "| ``bool_``     | 布尔(True 或 False) 一个字节 |\n",
    "| ``int_``      | 默认整数类型 (类似C的``long``; 通常可以是``int64``或``int32``)| \n",
    "| ``intc``      | 类似C的``int`` (通常可以是``int32``或``int64``)| \n",
    "| ``intp``      | 用于索引值的整数(类似C的``ssize_t``; 通常可以是``int32``或``int64``)| \n",
    "| ``int8``      | 整数，1字节 (-128 ～ 127)| \n",
    "| ``int16``     | 整数，2字节 (-32768 ～ 32767)|\n",
    "| ``int32``     | 整数，4字节 (-2147483648 ～ 2147483647)|\n",
    "| ``int64``     | 整数，8字节 (-9223372036854775808 ～ 9223372036854775807)| \n",
    "| ``uint8``     | 字节 (0 ～ 255)| \n",
    "| ``uint16``    | 无符号整数 (0 ～ 65535)| \n",
    "| ``uint32``    | 无符号整数 (0 ～ 4294967295)| \n",
    "| ``uint64``    | 无符号整数 (0 ～ 18446744073709551615)| \n",
    "| ``float_``    | `float64`的简写 | \n",
    "| ``float16``   | 半精度浮点数: 1比特符号位, 5比特指数位, 10比特尾数位 | \n",
    "| ``float32``   | 单精度浮点数: 1比特符号位, 8比特指数位, 23比特尾数位| \n",
    "| ``float64``   | 双精度浮点数: 1比特符号位, 11比特指数位, 52比特尾数位| \n",
    "| ``complex_``  | `complex128`的简写 | \n",
    "| ``complex64`` | 复数, 由2个单精度浮点数组成 | \n",
    "| ``complex128``| 复数, 由2个双精度浮点数组成 | "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> More advanced type specification is possible, such as specifying big or little endian numbers; for more information, refer to the [NumPy documentation](http://numpy.org/).\n",
    "NumPy also supports compound data types, which will be covered in [Structured Data: NumPy's Structured Arrays](02.09-Structured-Data-NumPy.ipynb).\n",
    "\n",
    "还有更多的高级的类型声明，比如指定大尾或小尾表示；需要获得更多内容，请查阅[NumPy在线文档](http://numpy.org/)。NumPy也支持复合数据类型，这部分我们将在[结构化数据：NumPy里的结构化数组](02.09-Structured-Data-NumPy.ipynb)中进行介绍"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Numpy介绍](02.00-Introduction-to-NumPy.ipynb) | [目录](Index.ipynb) | [Numpy数组基础](02.02-The-Basics-Of-NumPy-Arrays.ipynb) >\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/wangyingsm/Python-Data-Science-Handbook/blob/master/notebooks/02.01-Understanding-Data-Types.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open and Execute in Google Colaboratory\"></a>\n"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
